Tiancheng Zhao

OSVBench: Benchmarking LLMs on Specification Generation Tasks for Operating System Verification

Apr 29, 2025
Via arXiv

SRMF: A Data Augmentation and Multimodal Fusion Approach for Long-Tail UHR Satellite Image Segmentation

Apr 28, 2025
Via arXiv

VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model

Apr 10, 2025
Via arXiv

GeoRSMLLM: A Multimodal Large Language Model for Vision-Language Tasks in Geoscience and Remote Sensing

Mar 16, 2025
Via arXiv

The Self-Improvement Paradox: Can Language Models Bootstrap Reasoning Capabilities without External Scaffolding?

Feb 19, 2025
Via arXiv

GUI Testing Arena: A Unified Benchmark for Advancing Autonomous GUI Testing Agent

Dec 24, 2024
Via arXiv

Evaluating and Enhancing LLMs for Multi-turn Text-to-SQL with Multiple Question Types

Dec 21, 2024
Via arXiv

ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration

Nov 25, 2024
Via arXiv

Enhancing Ultra High Resolution Remote Sensing Imagery Analysis with ImageRAG

Nov 12, 2024
Via arXiv

OmChat: A Recipe to Train Multimodal Language Models with Strong Long Context and Video Understanding

Jul 06, 2024
Via arXiv